Abstract. Metric coinduction is a form of coinduction that can be used to establish
properties of objects constructed as a limit of finite approximations. One can prove a 
coinduction step showing that some property is preserved by one step of the approximation 
process, then automatically infer by the coinduction principle that the property holds of the 
limit object. This can often be used to avoid complicated analytic arguments involving 
limits and convergence, replacing them with simpler algebraic arguments. This paper 
examines the application of this principle in a variety of areas, including infinite streams, 
Markov chains, Markov decision processes, and non-well-founded sets. These results point 
to the usefulness of coinduction as a general proof technique. 



Mathematical induction is firmly entrenched as a fundamental and ubiquitous proof 
principle for proving properties of inductively defined objects. Mathematics and computer 
science abound with such objects, and mathematical induction is certainly one of the most 
important tools, if not the most important, at our disposal. 

Perhaps less well entrenched is the notion of coinduction. Despite recent interest, 
coinduction is still not fully established in our collective mathematical consciousness. A 
contributing factor is that coinduction is often presented in a relatively restricted form. 
Coinduction is often considered synonymous with bisimulation and is used to establish 
equality or other relations on infinite data objects such as streams [20] or recursive types 



In reality, coinduction is far more general. For example, it has recently been
observed [H] that coinductive reasoning can be used to avoid complicated ε-δ arguments
involving the limiting behavior of a stochastic process, replacing them with simpler
algebraic arguments that establish a coinduction hypothesis as an invariant of the process, then
automatically deriving the property in the limit by application of a coinduction principle.
The notion of bisimulation is a special case of this: establishing that a certain relation is a
bisimulation is tantamount to showing that a certain coinduction hypothesis is an invariant
of some process.

Coinduction, as a proof principle, can handle properties other than equality and
inequality and extends to other domains. The goal of this paper is to explore some of these
applications. We focus on four areas: infinite streams, Markov chains, Markov decision
processes, and non-well-founded sets. In Section 2 we present the metric coinduction
principle. In Section 3 we illustrate the use of the principle in the context of infinite streams
as an alternative to traditional methods involving bisimulation. In Sections 4 and 5 we
rederive some basic results of the theories of Markov chains and Markov decision processes,
showing how metric coinduction can simplify arguments. Finally, in Section 6 we use
metric coinduction to derive a new characterization of the hereditarily finite non-well-founded
sets.



2. Coinduction in Complete Metric Spaces 

2.1. Contractive Maps and Fixpoints. Let $(V, d)$ be a complete metric space. A function
$H : V \to V$ is contractive if there exists $0 \le c < 1$ such that for all $u, v \in V$,
$d(H(u),H(v)) \le c \cdot d(u,v)$. The value $c$ is called the constant of contraction. A continuous
function $H$ is said to be eventually contractive if $H^n$ is contractive for some $n \ge 1$.
Contractive maps are uniformly continuous, and by the Banach fixpoint theorem, any such
map has a unique fixpoint in $V$.

The fixpoint of a contractive map $H$ can be constructed explicitly as the limit of a
Cauchy sequence $u, H(u), H^2(u), \ldots$ starting at any point $u \in V$. The sequence is Cauchy;
one can show by elementary arguments that

\[ d(H^{n+m}(u), H^n(u)) \le c^n (1 - c^m)(1 - c)^{-1} \cdot d(H(u), u). \]

Since $V$ is complete, the sequence has a limit $u^*$, which by continuity must be a fixpoint of
$H$. Moreover, $u^*$ is unique: if $H(u) = u$ and $H(v) = v$, then

\[ d(u,v) = d(H(u),H(v)) \le c \cdot d(u,v) \ \Rightarrow\ d(u,v) = 0, \]

therefore $u = v$.

Eventually contractive maps also have unique fixpoints. If $H^n$ is contractive, let $u^*$ be
the unique fixpoint of $H^n$. Then $H(u^*)$ is also a fixpoint of $H^n$. But then $d(u^*, H(u^*)) =
d(H^n(u^*), H^{n+1}(u^*)) \le c \cdot d(u^*, H(u^*))$, which implies that $u^*$ is also a fixpoint of $H$.
It is the only one, since any fixpoint of $H$ is also a fixpoint of $H^n$.
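
The explicit construction above can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the map $\cos$ on the interval $[0,1]$ is our choice of contraction (its derivative is bounded by $\sin 1 < 1$ there), and the helper name `iterate_to_fixpoint` is hypothetical.

```python
import math

def iterate_to_fixpoint(H, u, dist, tol=1e-12, max_steps=10_000):
    """Follow the sequence u, H(u), H^2(u), ... until successive terms are within tol."""
    for _ in range(max_steps):
        nxt = H(u)
        if dist(nxt, u) <= tol:
            return nxt
        u = nxt
    raise RuntimeError("no convergence; is H really contractive?")

# Illustrative contraction (an assumption, not from the text): cos maps [0, 1]
# into itself and |cos'(x)| = |sin(x)| <= sin(1) < 1 on that interval.
H = math.cos
dist = lambda a, b: abs(a - b)

u_star = iterate_to_fixpoint(H, 0.0, dist)   # start at one end of the interval
v_star = iterate_to_fixpoint(H, 1.0, dist)   # start at the other end
```

Both starting points lead to the same limit up to the tolerance, as uniqueness of the fixpoint predicts.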

2.2. The Coinduction Rule. In the applications we will consider, the coinduction rule
takes the following simple form: If $\varphi$ is a closed nonempty subset of a complete metric
space $V$, and if $H$ is an eventually contractive map on $V$ that preserves $\varphi$, then the unique
fixpoint $u^*$ of $H$ is in $\varphi$. Expressed as a proof rule, this says for $\varphi$ a closed property,

\[ \frac{\exists u\ \varphi(u) \qquad \forall u\ \varphi(u) \Rightarrow \varphi(H(u))}{\varphi(u^*)}. \tag{2.1} \]

In [13], the rule was used in the special form in which $V$ was a Banach space (complete normed
linear space) and $H$ was an eventually contractive linear affine map on $V$.
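
A small numerical illustration of the rule may help. The following Python sketch is built on assumptions of ours: the space is $\mathbb{R}^2$ with the max metric, the map `H` and the property `phi` (lying on the diagonal) are hypothetical choices, and running the iteration is a sanity check, not a substitute for the proof rule.

```python
def H(p):
    """A contraction on R^2 under the max metric, with constant c = 1/2."""
    x, y = p
    return (y / 2 + 1, x / 2 + 1)

def phi(p, tol=0.0):
    """Closed property: the point lies on the diagonal x = y (up to tol)."""
    return abs(p[0] - p[1]) <= tol

# Premises of the rule: phi is nonempty, and H preserves it,
# since H(x, x) = (x/2 + 1, x/2 + 1) is again on the diagonal.
origin = (0.0, 0.0)
premises_hold = phi(origin) and phi(H(origin))

# The fixpoint can be approached from any starting point, even one outside phi.
start = (10.0, -3.0)
p = start
for _ in range(200):
    p = H(p)
```

Every iterate from this particular starting point lies off the diagonal in exact arithmetic, yet the limit $(2, 2)$ satisfies the property, which is what the rule guarantees without any limit argument.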

2.3. Why Is This Coinduction? We have called (2.1) a coinduction rule. To justify
this terminology, we must exhibit a category of coalgebras and show that the rule (2.1) is
equivalent to the assertion that a certain coalgebra is final in the category. This construction
was given in [14], but we repeat it here for completeness.

Say we have a contractive map $H$ on a metric space $V$ and a nonempty closed subset
$\varphi \subseteq V$ preserved by $H$. Define $H(\varphi) = \{H(s) \mid s \in \varphi\}$. Consider the category $C$ whose
objects are the nonempty closed subsets of $V$ and whose arrows are the reverse set inclusions;
thus there is a unique arrow $\varphi_1 \to \varphi_2$ iff $\varphi_1 \supseteq \varphi_2$. The map $\bar H$ defined by $\bar H(\varphi) = \mathrm{cl}(H(\varphi))$,
where cl denotes closure in the metric topology, is an endofunctor on $C$, since $\bar H(\varphi)$ is a
nonempty closed set, and $\varphi_1 \supseteq \varphi_2$ implies $\bar H(\varphi_1) \supseteq \bar H(\varphi_2)$. An $\bar H$-coalgebra is then a
nonempty closed set $\varphi$ such that $\varphi \supseteq \bar H(\varphi)$; equivalently, such that $\varphi \supseteq H(\varphi)$. The final
coalgebra is $\{u^*\}$, where $u^*$ is the unique fixpoint of $H$. The coinduction rule (2.1) says
that $\varphi \supseteq H(\varphi) \Rightarrow \varphi \supseteq \{u^*\}$, which is equivalent to the statement that $\{u^*\}$ is final in the
category of $\bar H$-coalgebras.



3. Streams 

Infinite streams have been a very successful source of application of coinductive techniques.
The space $\mathcal{S}_\Sigma = (\Sigma^\omega, \mathrm{head}, \mathrm{tail})$ of infinite streams over $\Sigma$ is the final coalgebra in the
category of simple transition systems over $\Sigma$, whose objects are $(X, \mathrm{obs}, \mathrm{cont})$, where $X$ is a
set, $\mathrm{obs} : X \to \Sigma$ gives an observation at each state, and $\mathrm{cont} : X \to X$ gives a continuation
(next state) for each state. The unique morphism $(X, \mathrm{obs}, \mathrm{cont}) \to (\Sigma^\omega, \mathrm{head}, \mathrm{tail})$ maps a
state $s \in X$ to the stream $\mathrm{obs}(s), \mathrm{obs}(\mathrm{cont}(s)), \mathrm{obs}(\mathrm{cont}^2(s)), \ldots \in \Sigma^\omega$.

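We begin by illustrating the use of the metric coinduction principle in this context as an
alternative to traditional methods involving bisimulation. It is well known that $\mathcal{S}_\Sigma$ forms
a complete metric space under the distance function $d(\sigma,\tau) = 2^{-n}$, where $n$ is the first
position at which $\sigma$ and $\tau$ differ. The metric $d$ satisfies the property

\[ d(x :: \sigma,\ y :: \tau) = \begin{cases} \tfrac12\, d(\sigma,\tau), & \text{if } x = y \\ 1, & \text{if } x \ne y. \end{cases} \]

One can also form the product space $\mathcal{S}_\Sigma^2$ with metric

\[ d((\sigma_1,\sigma_2),(\tau_1,\tau_2)) \stackrel{\mathrm{def}}{=} \max\,(d(\sigma_1,\tau_1),\ d(\sigma_2,\tau_2)). \]

Since distances are bounded, the spaces of continuous operators $\mathcal{S}_\Sigma^2 \to \mathcal{S}_\Sigma$ and $\mathcal{S}_\Sigma \to \mathcal{S}_\Sigma^2$
are also complete metric spaces under the sup metric

\[ d(E,F) = \sup_x d(E(x),F(x)). \]

Consider the operators $\mathrm{merge} : \mathcal{S}_\Sigma^2 \to \mathcal{S}_\Sigma$ and $\mathrm{split} : \mathcal{S}_\Sigma \to \mathcal{S}_\Sigma^2$ defined informally by

\[ \mathrm{merge}(a_0 a_1 a_2 \cdots,\ b_0 b_1 b_2 \cdots) = a_0 b_0 a_1 b_1 a_2 b_2 \cdots \]
\[ \mathrm{split}(a_0 a_1 a_2 \cdots) = (a_0 a_2 a_4 \cdots,\ a_1 a_3 a_5 \cdots). \]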

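Thus merge forms a single stream from two streams by taking elements alternately, and
split separates a single stream into two streams consisting of the even and odd elements,
respectively.

Formally, one would define merge and split coinductively as follows:

\[ \mathrm{merge}(x :: \sigma,\ \tau) \stackrel{\mathrm{def}}{=} x :: \mathrm{merge}(\tau, \sigma) \]
\[ \mathrm{split}(x :: y :: \sigma) \stackrel{\mathrm{def}}{=} (x :: \mathrm{split}(\sigma)_1,\ y :: \mathrm{split}(\sigma)_2). \]

These functions exist and are unique, since they are the unique fixpoints of the eventually
contractive maps

\[ \alpha : (\mathcal{S}_\Sigma^2 \to \mathcal{S}_\Sigma) \to (\mathcal{S}_\Sigma^2 \to \mathcal{S}_\Sigma) \qquad \beta : (\mathcal{S}_\Sigma \to \mathcal{S}_\Sigma^2) \to (\mathcal{S}_\Sigma \to \mathcal{S}_\Sigma^2) \]

defined by

\[ \alpha(M)(x :: \sigma,\ \tau) \stackrel{\mathrm{def}}{=} x :: M(\tau, \sigma) \]
\[ \beta(S)(x :: y :: \sigma) \stackrel{\mathrm{def}}{=} (x :: S(\sigma)_1,\ y :: S(\sigma)_2). \]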

We would like to show that merge and split are inverses. Traditionally, one would
do this by exhibiting a bisimulation between $\mathrm{merge}(\mathrm{split}(\sigma))$ and $\sigma$, thus concluding that
$\mathrm{merge}(\mathrm{split}(\sigma)) = \sigma$, and another bisimulation between $\mathrm{split}(\mathrm{merge}(\sigma,\tau))$ and $(\sigma,\tau)$, thus
concluding that $\mathrm{split}(\mathrm{merge}(\sigma,\tau)) = (\sigma,\tau)$.

Here is how we would prove this result using the metric coinduction rule (2.1). Let
$M : \mathcal{S}_\Sigma^2 \to \mathcal{S}_\Sigma$ and $S : \mathcal{S}_\Sigma \to \mathcal{S}_\Sigma^2$. If $M$ is a left inverse of $S$, then $\alpha^2(M)$ is a left inverse of
$\beta(S)$:

\[ \begin{aligned}
\alpha^2(M)(\beta(S)(x :: y :: \sigma)) &= \alpha(\alpha(M))(x :: S(\sigma)_1,\ y :: S(\sigma)_2) \\
&= x :: \alpha(M)(y :: S(\sigma)_2,\ S(\sigma)_1) \\
&= x :: y :: M(S(\sigma)_1,\ S(\sigma)_2) \\
&= x :: y :: M(S(\sigma)) \\
&= x :: y :: \sigma.
\end{aligned} \]

Similarly, if $M$ is a right inverse of $S$, then $\alpha^2(M)$ is a right inverse of $\beta(S)$:

\[ \begin{aligned}
\beta(S)(\alpha^2(M)(x :: \sigma,\ y :: \tau)) &= \beta(S)(\alpha(\alpha(M))(x :: \sigma,\ y :: \tau)) \\
&= \beta(S)(x :: \alpha(M)(y :: \tau,\ \sigma)) \\
&= \beta(S)(x :: y :: M(\sigma, \tau)) \\
&= (x :: S(M(\sigma,\tau))_1,\ y :: S(M(\sigma,\tau))_2) \\
&= (x :: (\sigma,\tau)_1,\ y :: (\sigma,\tau)_2) \\
&= (x :: \sigma,\ y :: \tau).
\end{aligned} \]

We conclude that if $M$ and $S$ are inverses, then so are $\alpha^2(M)$ and $\beta(S)$.
The property

\[ \varphi(M,S) \iff M \text{ and } S \text{ are inverses} \tag{3.1} \]

is a nonempty closed property of $(\mathcal{S}_\Sigma^2 \to \mathcal{S}_\Sigma) \times (\mathcal{S}_\Sigma \to \mathcal{S}_\Sigma^2)$ which, as we have just shown, is
preserved by the contractive map $(M,S) \mapsto (\alpha^2(M), \beta(S))$. By (2.1), $\varphi$ holds of the unique
fixpoint (merge, split).

That $\varphi$ is nonempty and closed requires an argument, but these conditions typically
follow from general topological considerations. For example, (3.1) is nonempty because the
spaces $\mathcal{S}_\Sigma$ and $\mathcal{S}_\Sigma^2$ are both homeomorphic to the topological product of countably many
copies of the discrete space $\Sigma$.
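
The coinductive definitions of merge and split translate almost directly into lazy streams. The following Python sketch models streams as infinite iterators (an assumption of this illustration) and compares only finite prefixes; agreement on the first $n$ elements means the streams are within distance $2^{-n}$, so this is a sanity check of the inverse property, not a proof. The helper `prefix` is hypothetical.

```python
from itertools import count, islice, tee

def merge(sigma, tau):
    """merge(x :: sigma, tau) = x :: merge(tau, sigma), on infinite iterators."""
    while True:
        yield next(sigma)
        sigma, tau = tau, sigma      # swap roles, as in the recursive call

def split(sigma):
    """split(x :: y :: sigma) = (x :: split(sigma)_1, y :: split(sigma)_2)."""
    evens, odds = tee(sigma)
    return islice(evens, 0, None, 2), islice(odds, 1, None, 2)

def prefix(stream, n):
    """The first n elements of a stream, as a list."""
    return list(islice(stream, n))

N = 20
# merge(split(sigma)) should reproduce sigma = 0, 1, 2, ...
roundtrip = prefix(merge(*split(count())), N)
# split(merge(sigma, tau)) should reproduce (sigma, tau) = (evens, odds)
left, right = split(merge(count(0, 2), count(1, 2)))
left_prefix, right_prefix = prefix(left, 5), prefix(right, 5)
```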

4. Markov Chains

A finite Markov chain is a finite state space, say $\{1, \ldots, n\}$, together with a stochastic
matrix $P \in \mathbb{R}^{n \times n}$ of transition probabilities, with $P_{st}$ representing the probability of a
transition from state $s$ to state $t$ in one step. The value $P^m_{st}$ is the probability that the
system is in state $t$ after $m$ steps, starting in state $s$.

A fundamental result in the theory of Markov chains is that if $P$ is irreducible and
aperiodic (definitions given below), then $P^m_{st}$ tends to $1/\mu_t$ as $m \to \infty$, where $\mu_t$ is the mean
first recurrence time of state $t$, the expected time of first reentry into state $t$ after leaving
state $t$. Intuitively, if we expect to be in state $t$ about every $\mu_t$ steps, then in the long run
we expect to be in state $t$ about $1/\mu_t$ of the time.

The proof of this result as given in Feller [10] is rather lengthy, involving a complicated
argument to establish the uniform convergence of a certain countable sequence of countable 
sequences. The complete proof runs to several pages. Introductory texts devote entire 
chapters to it (e.g. [12]) or omit the proof entirely (e.g. [H]). In this section we show that, 
assuming some basic spectral properties of stochastic matrices, the coinduction rule can be 
used to give a simpler alternative proof. 

4.1. Spectral Properties. Recall that $P$ is irreducible if its underlying support graph is
strongly connected. The support graph has vertices $\{1, \ldots, n\}$ and directed edges
$\{(s,t) \mid P_{st} > 0\}$. A directed graph is strongly connected if there is a directed path from any vertex
to any other vertex. The matrix $P$ is aperiodic if in addition, the gcd of the set $\{m \mid P^m_{ss} > 0\}$
is 1 for all states s. By the Perron-Frobenius theorem (see [H[T6]), if P is irreducible and 
aperiodic, then P has eigenvalue 1 with multiplicity 1 and all other eigenvalues have norm 
strictly less than 1. 

The matrix $P$ is itself not contractive, since 1 is an eigenvalue. However, consider the
matrix

\[ P - \tfrac1n \mathbf{1}\mathbf{1}^T, \]

where $\mathbf{1}$ is the column vector of all 1's and $T$ denotes matrix transpose. The matrix $\tfrac1n \mathbf{1}\mathbf{1}^T$
is the $n \times n$ matrix all of whose entries are $1/n$.

The spectra of $P$ and $P - \tfrac1n \mathbf{1}\mathbf{1}^T$ are closely related, as shown in the following lemma.

Lemma 4.1. Let $P \in \mathbb{R}^{n \times n}$ be a stochastic matrix. Any (left) eigenvector $x^T$ of $P - \tfrac1n \mathbf{1}\mathbf{1}^T$
that lies in the hyperplane $x^T \mathbf{1} = 0$ is also an eigenvector of $P$ with the same eigenvalue, and
vice-versa. The only other eigenvalue of $P$ is 1 and the only other eigenvalue of $P - \tfrac1n \mathbf{1}\mathbf{1}^T$
is 0.

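Proof. For any eigenvalue $\lambda$ of $P$ and corresponding eigenvector $x^T$,

\[ \lambda x^T \mathbf{1} = x^T P \mathbf{1} = x^T \mathbf{1} \]

since $P\mathbf{1} = \mathbf{1}$, so either $\lambda = 1$ or $x^T \mathbf{1} = 0$. Similarly, for any eigenvalue $\lambda$ of $P - \tfrac1n \mathbf{1}\mathbf{1}^T$ and
corresponding eigenvector $x^T$,

\[ \lambda x^T \mathbf{1} = x^T (P - \tfrac1n \mathbf{1}\mathbf{1}^T) \mathbf{1} = x^T \mathbf{1} - x^T \mathbf{1} = 0, \]

so either $\lambda = 0$ or $x^T \mathbf{1} = 0$. But if $x^T \mathbf{1} = 0$, then

\[ x^T (P - \tfrac1n \mathbf{1}\mathbf{1}^T) = x^T P - \tfrac1n x^T \mathbf{1}\mathbf{1}^T = x^T P, \]

so in this case $x^T$ is an eigenvector of $P$ iff it is an eigenvector of $P - \tfrac1n \mathbf{1}\mathbf{1}^T$ with the same
eigenvalue. □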

4.2. Coinduction and the Convergence of $P^m$. If $P$ is irreducible and aperiodic, then
$P - \tfrac1n \mathbf{1}\mathbf{1}^T$ is eventually contractive, since $\inf_m \sqrt[m]{\|(P - \tfrac1n \mathbf{1}\mathbf{1}^T)^m\|}$ is equal to the spectral
radius or norm of the largest eigenvalue of $P - \tfrac1n \mathbf{1}\mathbf{1}^T$ (see [9]), which by Lemma 4.1 is less
than 1. Thus the map

\[ x^T \mapsto x^T (P - \tfrac1n \mathbf{1}\mathbf{1}^T) + \tfrac1n \mathbf{1}^T \tag{4.1} \]

is of the proper form to be used with the metric coinduction rule (2.1) to establish the
convergence of $P^m$.

Since $P - \tfrac1n \mathbf{1}\mathbf{1}^T$ is eventually contractive, the map (4.1) has a unique fixpoint $u^T$. The
set of stochastic vectors

\[ S = \{x^T \mid x^T \ge 0,\ x^T \mathbf{1} = 1\} \]

is closed and preserved by the map (4.1), since

\[ x^T \mathbf{1} = 1 \ \Rightarrow\ x^T (P - \tfrac1n \mathbf{1}\mathbf{1}^T) + \tfrac1n \mathbf{1}^T = x^T P, \]

and $S$ is preserved by $P$. By the metric coinduction rule (2.1), the unique fixpoint $u^T$ is
contained in $S$. By Lemma 4.1, it is also an eigenvector of $P$ with eigenvalue 1, and $y^T P^m$ tends to $u^T$ for any
$y^T \in S$. Applying this to the rows of any stochastic matrix $E$, we have that $E P^m$ converges
to the matrix $\mathbf{1} u^T$.
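
As a numerical illustration, the map (4.1) can be iterated directly. The Python sketch below uses a small transition matrix that we made up for the purpose (irreducible, with self-loops, hence aperiodic); the function name `step` is hypothetical. It starts from a vector that is deliberately not stochastic to show that the fixpoint nevertheless lands in the set of stochastic vectors and is stationary for $P$.

```python
def step(x, P):
    """One application of x^T -> x^T (P - (1/n) 1 1^T) + (1/n) 1^T."""
    n = len(P)
    total = sum(x)                       # x^T 1
    xP = [sum(x[s] * P[s][t] for s in range(n)) for t in range(n)]
    # x^T (1/n) 1 1^T is the row vector with every entry equal to total / n
    return [xP[t] - total / n + 1.0 / n for t in range(n)]

# Illustrative stochastic matrix (an assumption of this sketch).
P = [[0.5, 0.5, 0.0],
     [0.2, 0.3, 0.5],
     [0.6, 0.0, 0.4]]

x = [5.0, -2.0, 7.0]                     # not a stochastic vector
for _ in range(500):
    x = step(x, P)

u = x                                     # approximate fixpoint u^T
uP = [sum(u[s] * P[s][t] for s in range(3)) for t in range(3)]
```

The iteration converges from any starting vector because the linear part of the map is eventually contractive; membership of the limit in $S$ is what the coinduction rule supplies.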

4.3. Recurrence Statistics. Once we have established the convergence of $P^m$, we can
give a much shorter argument than those of [10, 12] that the actual limit of $P^m_{st}$ is $1/\mu_t$. We
follow the notation of [10].

Fix a state $t$, and let $\mu = \mu_t$. Let $f_m$ be the probability that after leaving state $t$,
the system first returns to state $t$ at time $m$. Let $u_m = P^m_{tt}$ be the probability that the
system is in state $t$ at time $m$ after starting in state $t$. By irreducibility, $\sum_{m=1}^\infty f_m = 1$ and
$\mu = \sum_{m=1}^\infty m f_m < \infty$. Let $\rho_m = \sum_{k=m+1}^\infty f_k$ and consider the generating functions

\[ f(x) \stackrel{\mathrm{def}}{=} \sum_{m=1}^\infty f_m x^m \qquad u(x) \stackrel{\mathrm{def}}{=} \sum_{m=0}^\infty u_m x^m \]
\[ \rho(x) \stackrel{\mathrm{def}}{=} \sum_{m=0}^\infty \rho_m x^m \qquad \sigma(x) \stackrel{\mathrm{def}}{=} u_0 + \sum_{m=0}^\infty (u_{m+1} - u_m) x^{m+1}. \]

The probabilities $u_n$ obey the recurrence

\[ u_0 = 1, \qquad u_n = \sum_{m=0}^{n-1} u_m f_{n-m}, \]

which implies that $f(x)u(x) = u(x) - 1$. Elementary algebraic reasoning gives

\[ \sigma(x)\rho(x) = 1. \tag{4.2} \]
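
The algebra behind (4.2) can be spelled out as follows. This derivation is our own filling-in from the definitions, not quoted from the text; it uses $\rho_0 = \sum_{k \ge 1} f_k = 1$ and $\rho_{m-1} - \rho_m = f_m$ for $m \ge 1$.

\[ \begin{aligned}
\sigma(x) &= u(x) - x\,u(x) = (1 - x)\,u(x), \\
(1 - x)\,\rho(x) &= \rho_0 - \sum_{m=1}^\infty (\rho_{m-1} - \rho_m)\,x^m = 1 - f(x), \\
\sigma(x)\,\rho(x) &= u(x)\,(1 - f(x)) = u(x) - f(x)\,u(x) = 1.
\end{aligned} \]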
Now we claim that both $\sigma(1)$ and $\rho(1)$ converge. The sequence $\rho(1)$ converges to $\mu > 0$,
since

\[ \rho(1) = \sum_{m=1}^\infty \rho_{m-1} = \sum_{m=1}^\infty m f_m = \mu, \tag{4.3} \]

and the latter sequence in (4.3) converges absolutely. For $\sigma(1)$, we have

\[ \sigma(1) = u_0 + \sum_{m=0}^\infty (u_{m+1} - u_m), \]

which converges by the results of Section 4.2. By (4.2), $\sigma(1)\rho(1) = 1$, therefore $\sigma(1) = 1/\mu$.
But the $m$th partial sum of $\sigma(1)$ is just $u_0 + \sum_{k=0}^{m-1} (u_{k+1} - u_k) = u_m$, so the sequence $u_m$
converges to $1/\mu$.
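
The renewal recurrence and its limit are easy to check numerically. In the Python sketch below, the first-return distribution `f` is a hypothetical example of ours (returns at time 1 or 2 with equal probability, so $\mu = 1.5$); the code only evaluates the recurrence for $u_n$ and compares the result with $1/\mu$.

```python
# Hypothetical first-return distribution (an assumption for illustration):
# f[m] is the probability that the first return to state t happens at time m.
f = {1: 0.5, 2: 0.5}                      # aperiodic, since gcd(1, 2) = 1
mu = sum(m * p for m, p in f.items())     # mean first recurrence time

# Renewal recurrence: u_0 = 1, u_n = sum_{m=0}^{n-1} u_m * f_{n-m}.
N = 60
u = [1.0]
for n in range(1, N + 1):
    u.append(sum(u[m] * f.get(n - m, 0.0) for m in range(n)))

limit = 1.0 / mu                           # predicted limit of u_n
```

For this particular `f` the recurrence reduces to $u_n = \tfrac12 u_{n-1} + \tfrac12 u_{n-2}$, whose solutions approach $2/3$ geometrically.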

5. Markov Decision Processes 

In this section, we rederive some fundamental results on Markov decision processes using 
the metric coinduction principle. A fairly general treatment of this theory is given in [8], 
and we follow the notation of that paper. However, the strategic use of metric coinduction 
allows a more streamlined presentation. 

5.1. Existence of Optimal Strategies. Let $V$ be the space of bounded real-valued functions
on a set of states $\Omega$ with the sup norm $\|v\| \stackrel{\mathrm{def}}{=} \sup_{x \in \Omega} |v(x)|$. The space $V$ is a complete
metric space with metric $\|v - u\|$.

For each state $x \in \Omega$, say we have a set $\Delta_x$ of actions. A deterministic strategy is an
element of $\Delta = \prod_{x \in \Omega} \Delta_x$, thus a selection of actions, one for each state $x \in \Omega$. More
generally, if $\Delta_x$ is a measurable space, let $M(\Delta_x)$ denote the space of probability measures
on $\Delta_x$. A probabilistic strategy is an element of $\prod_{x \in \Omega} M(\Delta_x)$, thus a selection of probability
measures, one for each $x \in \Omega$. A deterministic strategy can be viewed as a probabilistic
strategy in which all the measures are point masses.

Now suppose we have a utility function $h : \prod_{x \in \Omega} (\Delta_x \to V \to \mathbb{R})$ with the three properties
listed below.¹ The function $h$ induces a function $H$ such that $H_\delta(u)(x) = h(x, \delta_x, u) \in \mathbb{R}$,
where $x \in \Omega$, $\delta \in \Delta$, and $u \in V$.

(i) The function $H$ is uniformly bounded as a function of $\delta$ and $x$. That is, $H_\delta : V \to V$,
and for any fixed $u \in V$, $\sup_{\delta \in \Delta} \|H_\delta(u)\|$ is finite.

(ii) The functions $H_\delta$ are uniformly contractive with constant of contraction $c < 1$. That
is, for all $\delta \in \Delta$ and $u, v \in V$, $\|H_\delta(v) - H_\delta(u)\| \le c \cdot \|v - u\|$. Thus $H_\delta$ has a unique
fixpoint, which we denote by $v_\delta$.

(iii) Every $H_\delta$ is monotone: if $u \le v$, then $H_\delta(u) \le H_\delta(v)$. The order $\le$ on $V$ is the
pointwise order.

Lemma 5.1. Define $A : V \to V$ by $A(u)(x) \stackrel{\mathrm{def}}{=} \sup_{d \in \Delta_x} h(x,d,u)$. The supremum exists
since the $H_\delta$ are uniformly bounded. Then $A$ is contractive with constant of contraction $c$.

¹ We write $h(x, \delta_x, u)$ instead of $h(x)(\delta_x)(u)$ for readability.

Proof. Let $\varepsilon > 0$. For any $x \in \Omega$, assuming without loss of generality that $A(v)(x) \ge A(u)(x)$,

\[ \begin{aligned}
|A(v)(x) - A(u)(x)| &= \sup_{d \in \Delta_x} h(x,d,v) - \sup_{e \in \Delta_x} h(x,e,u) \\
&< \varepsilon + h(x,d,v) - \sup_{e \in \Delta_x} h(x,e,u) \quad \text{for suitably chosen } d \in \Delta_x \\
&\le \varepsilon + h(x,d,v) - h(x,d,u) \\
&\le \varepsilon + c \cdot \|v - u\|.
\end{aligned} \]

Since $\varepsilon$ was arbitrary, $|A(v)(x) - A(u)(x)| \le c \cdot \|v - u\|$, thus

\[ \|A(v) - A(u)\| = \sup_x |A(v)(x) - A(u)(x)| \le c \cdot \|v - u\|. \]

□

Since $A$ is contractive, it has a unique fixpoint $v^*$.

Lemma 5.2. For any $\delta$, $v_\delta \le v^*$.

Proof. By the coinduction principle, it suffices to show that $u \le v$ implies $H_\delta(u) \le A(v)$.
Here the metric space is $V^2$, the closed property $\varphi$ is $u \le v$, and the contractive map is
$(H_\delta, A)$. But if $u \le v$, then by monotonicity,

\[ H_\delta(u)(x) \le H_\delta(v)(x) = h(x, \delta_x, v) \le \sup_{d \in \Delta_x} h(x, d, v) = A(v)(x). \]

□



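Lemma 5.3. The fixpoint $v^*$ can be approximated arbitrarily closely by $v_\delta$ for deterministic
strategies $\delta$.

Proof. Let $\varepsilon > 0$. Let $\delta$ be such that for all $x$,

\[ \sup_{d \in \Delta_x} h(x, d, v^*) - h(x, \delta_x, v^*) < (1 - c)\varepsilon. \]

We will show that $\|v^* - v_\delta\| \le \varepsilon$. By the coinduction rule (2.1), it suffices to show that
$\|v^* - u\| \le \varepsilon$ implies $\|v^* - H_\delta(u)\| \le \varepsilon$. Here the metric space is $V$, the closed property
$\varphi(u)$ is $\|v^* - u\| \le \varepsilon$, and the contractive map is $H_\delta$. But if $\|v^* - u\| \le \varepsilon$,

\[ \begin{aligned}
\|v^* - H_\delta(u)\| &= \sup_x |v^*(x) - H_\delta(u)(x)| = \sup_x |A(v^*)(x) - H_\delta(u)(x)| \\
&= \sup_x \Big| \sup_{d \in \Delta_x} h(x,d,v^*) - h(x,\delta_x,u) \Big| \\
&\le \sup_x \Big( \Big| \sup_{d \in \Delta_x} h(x,d,v^*) - h(x,\delta_x,v^*) \Big| + |h(x,\delta_x,v^*) - h(x,\delta_x,u)| \Big) \\
&\le (1 - c)\varepsilon + c \cdot \|v^* - u\| \le (1 - c)\varepsilon + c\varepsilon = \varepsilon.
\end{aligned} \]

□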
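
Lemmas 5.2 and 5.3 can be observed on a toy example. The Python sketch below assumes a standard discounted-reward utility, $h(x,d,u) = r(x,d) + \gamma \sum_y p(y \mid x,d)\,u(y)$, which is monotone in $u$ and contractive with constant $c = \gamma$; this concrete form of $h$, the numbers, and the function names are all our own assumptions. With finitely many actions the supremum is attained, so the greedy strategy reaches $v^*$ exactly rather than only approximately.

```python
GAMMA = 0.9
STATES = range(3)
ACTIONS = {0: ["a", "b"], 1: ["a", "b"], 2: ["a"]}      # the sets Delta_x
REWARD = {(0, "a"): 1.0, (0, "b"): 0.0, (1, "a"): 0.0, (1, "b"): 2.0, (2, "a"): 0.5}
NEXT = {  # transition probabilities over next states, illustrative values
    (0, "a"): [0.5, 0.5, 0.0], (0, "b"): [0.0, 1.0, 0.0],
    (1, "a"): [1.0, 0.0, 0.0], (1, "b"): [0.0, 0.2, 0.8],
    (2, "a"): [0.3, 0.3, 0.4],
}

def h(x, d, u):
    """Assumed utility: immediate reward plus discounted expected future value."""
    return REWARD[x, d] + GAMMA * sum(p * u[y] for y, p in zip(STATES, NEXT[x, d]))

def A(u):
    """A(u)(x) = sup over d in Delta_x of h(x, d, u)."""
    return [max(h(x, d, u) for d in ACTIONS[x]) for x in STATES]

def H(delta, u):
    """H_delta(u)(x) = h(x, delta_x, u) for a deterministic strategy delta."""
    return [h(x, delta[x], u) for x in STATES]

def fixpoint(F, u, steps=500):
    """Approximate the unique fixpoint of a contraction by iteration."""
    for _ in range(steps):
        u = F(u)
    return u

zero = [0.0, 0.0, 0.0]
v_star = fixpoint(A, zero)
greedy = {x: max(ACTIONS[x], key=lambda d: h(x, d, v_star)) for x in STATES}
v_greedy = fixpoint(lambda u: H(greedy, u), zero)
v_other = fixpoint(lambda u: H({0: "b", 1: "a", 2: "a"}, u), zero)
```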

5.2. Probabilistic Strategies. We use the metric coinduction rule (2.1) to prove the well-known
result that for Markov decision processes, probabilistic strategies are no better than
deterministic strategies. If $\sup_{d \in \Delta_x} h(x,d,v^*)$ is attainable for all $x$, then the deterministic
strategy $\delta_x \stackrel{\mathrm{def}}{=} \operatorname{argmax}_{d \in \Delta_x} h(x, d, v^*)$ is optimal, even allowing probabilistic strategies.
However, if $\sup_{d \in \Delta_x} h(x,d,v^*)$ is not attainable, then it is not so obvious what to do.

For this argument, we assume that $\Delta_x$ is a measurable space and that for all fixed
$x$ and $u$, $h(x,d,u)$ is an integrable function of $d \in \Delta_x$. Given a probabilistic strategy
$\mu \in \prod_{x \in \Omega} M(\Delta_x)$, the one-step utility function is $H_\mu : V \to V$ defined by the Lebesgue
integral

\[ H_\mu(u)(x) \stackrel{\mathrm{def}}{=} \int_{d \in \Delta_x} h(x, d, u) \cdot \mu_x(\mathrm{d}d). \]

This integral accumulates the various individual payoffs over all choices of $d$ weighted by
the measure $\mu_x$.

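The map $H_\mu(u)$ is uniformly bounded in $\mu$, since

\[ \begin{aligned}
\sup_x \Big| \int_{d \in \Delta_x} h(x, d, u) \cdot \mu_x(\mathrm{d}d) \Big|
&\le \sup_x \int_{d \in \Delta_x} |h(x, d, u)| \cdot \mu_x(\mathrm{d}d) \\
&\le \sup_x \sup_d |h(x, d, u)| \cdot \int_{d \in \Delta_x} \mu_x(\mathrm{d}d) = \sup_{x,d} |h(x, d, u)|.
\end{aligned} \]

It is also a contractive map with constant of contraction $c$, since

\[ \begin{aligned}
\|H_\mu(v) - H_\mu(u)\| &= \sup_x |H_\mu(v)(x) - H_\mu(u)(x)| \\
&= \sup_x \Big| \int_{d \in \Delta_x} h(x, d, v) \cdot \mu_x(\mathrm{d}d) - \int_{d \in \Delta_x} h(x, d, u) \cdot \mu_x(\mathrm{d}d) \Big| \\
&= \sup_x \Big| \int_{d \in \Delta_x} (h(x, d, v) - h(x, d, u)) \cdot \mu_x(\mathrm{d}d) \Big| \\
&\le \sup_x \int_{d \in \Delta_x} |h(x,d,v) - h(x,d,u)| \cdot \mu_x(\mathrm{d}d) \\
&\le \sup_x \int_{d \in \Delta_x} c \cdot \|v - u\| \cdot \mu_x(\mathrm{d}d) \\
&= c \cdot \|v - u\| \cdot \sup_x \int_{d \in \Delta_x} \mu_x(\mathrm{d}d) \\
&= c \cdot \|v - u\|.
\end{aligned} \]

Since it is a contractive map, it has a unique fixpoint $v_\mu$.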

Now take any deterministic strategy $\delta$ such that $h(x, \delta_x, v_\mu) \ge v_\mu(x)$ for all $x$. This is
always possible, since if $h(x, d, v_\mu) < v_\mu(x)$ for all $d \in \Delta_x$, then

\[ v_\mu(x) = H_\mu(v_\mu)(x) = \int_{d \in \Delta_x} h(x, d, v_\mu) \cdot \mu_x(\mathrm{d}d) < v_\mu(x), \]

a contradiction. The following lemma says that the deterministic strategy $\delta$ is no worse
than the probabilistic strategy $\mu$.



Lemma 5.4. $v_\delta \ge v_\mu$.

Proof. Assuming $v_\mu \le v$, we have

\[ v_\mu(x) \le h(x, \delta_x, v_\mu) \le h(x, \delta_x, v) = H_\delta(v)(x), \]

the second inequality by monotonicity. As $x$ was arbitrary, $v_\mu \le H_\delta(v)$. The result follows
from the coinduction principle on the metric space $V$ with $\varphi(v)$ the closed property $v_\mu \le v$
and contractive map $H_\delta$. □
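
The construction of $\delta$ from $\mu$ can be traced on a two-state example. In the Python sketch below, the utility $h$ (reward plus discounted value of a deterministic successor), the numbers, and all names are hypothetical choices of ours; with finitely many actions the integral defining $H_\mu$ is a finite weighted sum.

```python
GAMMA = 0.5
ACTIONS = ["stay", "go"]
# Hypothetical rewards and deterministic successor states.
REWARD = {(0, "stay"): 1.0, (0, "go"): 0.0, (1, "stay"): 3.0, (1, "go"): 0.0}
MOVE = {(0, "stay"): 0, (0, "go"): 1, (1, "stay"): 1, (1, "go"): 0}

def h(x, d, u):
    """Assumed utility: monotone in u and contractive with constant GAMMA."""
    return REWARD[x, d] + GAMMA * u[MOVE[x, d]]

def H_mu(mu, u):
    """H_mu(u)(x): h(x, d, u) integrated against mu_x, here a finite sum."""
    return [sum(mu[x][d] * h(x, d, u) for d in ACTIONS) for x in (0, 1)]

def fixpoint(F, u, steps=200):
    for _ in range(steps):
        u = F(u)
    return u

# A probabilistic strategy: choose each action with probability 1/2.
mu = [{"stay": 0.5, "go": 0.5}, {"stay": 0.5, "go": 0.5}]
v_mu = fixpoint(lambda u: H_mu(mu, u), [0.0, 0.0])

# Pick delta_x with h(x, delta_x, v_mu) >= v_mu(x); the best action always qualifies.
delta = [max(ACTIONS, key=lambda d, x=x: h(x, d, v_mu)) for x in (0, 1)]
point_mass = [{d: 1.0 if d == delta[x] else 0.0 for d in ACTIONS} for x in (0, 1)]
v_delta = fixpoint(lambda u: H_mu(point_mass, u), [0.0, 0.0])
```

Here $v_\mu = (1.5, 2.5)$ while the derived deterministic strategy achieves $v_\delta = (2, 6)$, in line with Lemma 5.4.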



6. Non-Well-Founded Sets 

In classical Zermelo-Fraenkel set theory with choice (ZFC), the "element of" relation $\in$
is well-founded, as guaranteed by the axiom of foundation. Aczel [2] developed the theory
of non-well-founded sets, in which sets with infinitely descending $\in$-chains are permitted in
addition to the well-founded sets. These are precisely the sets that are explicitly ruled out
of existence by the axiom of foundation.

In the theory of non-well-founded sets, the sets are represented by accessible pointed
graphs (APGs). An APG is a directed graph with a distinguished node such that every
node is reachable by a directed path from the distinguished node. Two APGs represent the
same set iff they are bisimilar. The APGs of well-founded sets may be infinite, but may
contain no infinite paths or cycles, whereas the APGs of non-well-founded sets may contain
infinite paths and cycles. Equality as bisimulation is the natural analog of extensionality
in ZFC; essentially, two APGs are declared equal as sets if there is no witness among their
descendants that forces them not to be. The class $V$ is the class of sets defined in this way.

Aczel [2] (see also [U [21]) notes the strong role that coinduction plays in this theory.
Since equality between APGs is defined in terms of bisimulation, coinduction becomes a 
primary proof technique for establishing the equivalence of different APGs representing the 
same set. 

In attempting to define a metric on non-well-founded sets, the classical Hausdorff distance
suggests itself as a promising candidate. This metric has been previously defined for
the hereditarily finite well-founded sets and their completion, the finitary non-well-founded
sets, by Abramsky [1]. For the more general case of arbitrary non-well-founded sets, there
are two complications. One is that we must apply the definition coinductively. Another
is that ordinarily, the Hausdorff metric is only defined on compact sets, since otherwise a
Hausdorff distance of zero may not imply equality, and that is the case here. However, the
definition still makes sense even for non-compact sets and leads to further insights into the
structure of non-well-founded sets.

In this section, we define a distance function $d : V^2 \to \mathbb{R}$ based on a coinductive
application of the Hausdorff distance function and derive some properties of $d$. We show
that $(V, d)$ forms a compact pseudometric space. Being a pseudometric instead of a metric
means that there are sets $s \ne t$ with $d(s, t) = 0$. Nevertheless, we identify a maximal family
of sets that includes all the hereditarily finite sets on which $d$ acts as a metric.

We will prove the following results. Define $s \approx t$ if $d(s, t) = 0$. Call a set $s$ singular if
the only $t$ such that $s \approx t$ is $s$ itself.

• A set is singular if and only if it is hereditarily finite.

• All singular sets are closed in the pseudometric topology. In particular, all hereditarily
finite sets are hereditarily closed (but not vice-versa).

• A set is hereditarily closed if and only if it is closed and all elements are singular.

• All hereditarily closed sets are canonical (but not vice-versa), where a set is canonical if
it is a member of a certain coinductively-defined class of canonical representatives of the
$\approx$-classes.

• The map $d$ is a metric on the canonical sets; moreover, the canonical sets are a maximal
class for which this is true.



6.1. Coinductive Definition of Functions. Just as classical ZFC allows the definition
of functions by induction over ordinary well-founded sets, there is a corresponding principle
for non-well-founded sets known as the Solution Lemma [21 [21]. In particular, the Solution
Lemma implies that for any function $H : V \to V$, the equation

\[ G(s) = \{G(u) \mid u \in H(s)\} \tag{6.1} \]

determines $G : V \to V$ uniquely. This is because if $G$ and $G'$ both satisfy (6.1), then the
relation

\[ u \mathrel{R} v \iff \exists s\ \ u = G(s) \wedge v = G'(s) \]

is a bisimulation, therefore $G(s) = G'(s)$ for all $s$. In coalgebraic terms², the map $G$ is the
unique morphism from the coalgebra $(V, \{(s,t) \mid s \in H(t)\})$ to the final coalgebra $(V, \in)$;
see [H Chp. 7] or [2H Part V]. 



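6.2. Definition of d. Let $B$ be the Banach space of bounded real-valued functions $g : \mathrm{APG}^2 \to \mathbb{R}$
with norm

\[ \|g\| \stackrel{\mathrm{def}}{=} \sup_{s,t} |g(s,t)|. \]

Define the map $\tau : B \to B$ by

\[ \tau(g)(s,t) \stackrel{\mathrm{def}}{=} \begin{cases}
0 & \text{if } s, t = \emptyset \\
1 & \text{if } s = \emptyset \ne t \text{ or } s \ne \emptyset = t \\
\tfrac12 \max\Big( \sup_{u \in s} \inf_{v \in t} g(u,v),\ \sup_{v \in t} \inf_{u \in s} g(u,v) \Big) & \text{if } s, t \ne \emptyset.
\end{cases} \]

It can be shown that $\|\tau(g) - \tau(g')\| \le \tfrac12 \|g - g'\|$, thus $\tau$ is contractive on $B$ with constant
of contraction 1/2 and has a unique fixpoint $d \in B$. One can therefore use the metric
coinduction rule (2.1) to prove properties of $d$.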

To illustrate, let us show that the non-well-founded sets $V$ form a compact (thus complete)
pseudometric space with respect to the distance function $d$. At the outset, it is not
immediately clear that $d$ is well-defined on $V$. We must argue that $d$ is invariant on
bisimulation classes; that is, for any bisimulation $R$, if $s \mathrel{R} s'$ and $t \mathrel{R} t'$, then $d(s,t) = d(s',t')$.
We will use the metric coinduction rule (2.1) to prove this.

Consider the following closed property on $B$, defined with respect to an arbitrary but
fixed bisimulation $R$ on the class of APGs:

\[ \varphi(g) \iff \forall s\, \forall s'\, \forall t\, \forall t'\ \ (s \mathrel{R} s' \wedge t \mathrel{R} t' \Rightarrow g(s,t) = g(s',t')). \]

² When regarding $V$ as a coalgebra, the notation $(V, \in)$ is a slight but convenient abuse. Formally, these
structures are coalgebras with respect to the powerset functor $\mathcal{P}$. To be precise, we should write $(V, \beta)$,
where $\beta : V \to \mathcal{P}V$ and write $s \in \beta(t)$ instead of $s \in t$.

This property is closed in the metric topology on $B$, since it is an infinite conjunction of
closed properties $g(s,t) = g(s',t')$, one for each selection of $s, s', t, t'$ such that $s \mathrel{R} s'$ and
$t \mathrel{R} t'$. It is clearly nonempty. We wish to prove that $\varphi(d)$. By the metric coinduction rule
(2.1), it suffices to show that $\varphi$ is closed under $\tau$.

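Suppose $\varphi(g)$. We want to show that $\varphi(\tau(g))$, or in other words,

\[ \forall s\, \forall s'\, \forall t\, \forall t'\ \ (s \mathrel{R} s' \wedge t \mathrel{R} t' \Rightarrow \tau(g)(s,t) = \tau(g)(s',t')). \]

Let $s, s', t, t'$ be such that $s \mathrel{R} s'$ and $t \mathrel{R} t'$. Since $R$ is a bisimulation, we have

\[ \forall u \in s\ \exists u' \in s'\ \ u \mathrel{R} u' \qquad \forall u' \in s'\ \exists u \in s\ \ u \mathrel{R} u' \]
\[ \forall v \in t\ \exists v' \in t'\ \ v \mathrel{R} v' \qquad \forall v' \in t'\ \exists v \in t\ \ v \mathrel{R} v'. \]

It follows that $s = \emptyset$ iff $s' = \emptyset$ and $t = \emptyset$ iff $t' = \emptyset$. If $s = s' = \emptyset$, then

\[ \tau(g)(s,t) = \begin{cases} 0 & \text{if } t, t' = \emptyset \\ 1 & \text{if } t, t' \ne \emptyset \end{cases} \ = \tau(g)(s',t'). \]

A symmetric argument holds if $t = t' = \emptyset$.

Otherwise, all four sets $s, s', t, t'$ are nonempty. In this case,

\[ \tau(g)(s,t) = \tfrac12 \max\Big( \sup_{u \in s} \inf_{v \in t} g(u,v),\ \sup_{v \in t} \inf_{u \in s} g(u,v) \Big) \]
\[ \tau(g)(s',t') = \tfrac12 \max\Big( \sup_{u' \in s'} \inf_{v' \in t'} g(u',v'),\ \sup_{v' \in t'} \inf_{u' \in s'} g(u',v') \Big), \]

so it suffices to show that

\[ \sup_{u \in s} \inf_{v \in t} g(u,v) = \sup_{u' \in s'} \inf_{v' \in t'} g(u',v') \tag{6.2} \]
\[ \sup_{v \in t} \inf_{u \in s} g(u,v) = \sup_{v' \in t'} \inf_{u' \in s'} g(u',v'). \tag{6.3} \]

We show only (6.2); the argument for (6.3) is symmetric. Also by symmetry, we need only
show the inequality in one direction:

\[ \sup_{u \in s} \inf_{v \in t} g(u,v) \le \sup_{u' \in s'} \inf_{v' \in t'} g(u',v'). \]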

This inequality follows from the property

\[ \forall u \in s\ \exists u' \in s'\ \ \inf_{v \in t} g(u,v) \le \inf_{v' \in t'} g(u',v'), \]

which in turn follows from

\[ \forall u \in s\ \exists u' \in s'\ \forall v' \in t'\ \exists v \in t\ \ g(u,v) \le g(u',v'). \]

In fact, we have

\[ \forall u \in s\ \exists u' \in s'\ \forall v' \in t'\ \exists v \in t\ \ g(u,v) = g(u',v') \]

by choosing $u' \in s'$ such that $u \mathrel{R} u'$ and $v \in t$ such that $v \mathrel{R} v'$, as guaranteed by the
coinduction hypothesis and the fact that $R$ is a bisimulation.

We conclude by the metric coinduction principle (2.1) that $\varphi(d)$ holds, thus $d$ is invariant
on the equivalence classes of any bisimulation $R$ on APGs, therefore well-defined on $V$.

To show that $d$ is a pseudometric, we must also show

\[ d(s,t) \ge 0 \ \ (\text{in fact, } d(s,t) \in [0,1]) \qquad d(s,t) = d(t,s) \]
\[ d(s,u) \le d(s,t) + d(t,u) \qquad d(s,s) = 0. \]

All these properties can be shown in the same way, by metric coinduction. One need only
argue that they are all nonempty closed properties closed under $\tau$.

We will establish compactness (hence completeness) later in Section 6.4, but first we
introduce the canonical sets.

6.3. Canonical Sets. The map $d$ is only a pseudometric and not a metric, since it is
possible that $d(s,t) = 0$ even though $s \ne t$. For example, define $\bar 0 = \emptyset$, $\overline{n+1} = \{\bar n\}$.
Let $\Omega$ be the unique non-well-founded set such that $\Omega = \{\Omega\}$. The sets $\{\bar n \mid n \ge 0\}$ and
$\{\bar n \mid n \ge 0\} \cup \{\Omega\}$ are distinct, but distance 0 apart (Fig. 1). This follows from the observation
that $d(\bar n, \Omega) = 2^{-n}$, so

\[ \sup_{v \in \{\bar n \mid n \ge 0\} \cup \{\Omega\}}\ \inf_{u \in \{\bar n \mid n \ge 0\}} d(u, v) = \inf_{n \ge 0} d(\bar n, \Omega) = 0. \]
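
The observation $d(\bar n, \Omega) = 2^{-n}$ can be checked by iterating the map $\tau$ of Section 6.2 on a finite graph. The Python sketch below is an illustration under our own assumptions: sets are represented by the nodes of a finite graph closed under taking elements (so sup and inf become max and min), and the names `tau`, `elems`, and the node label `"omega"` are hypothetical.

```python
def tau(g, elems):
    """One application of tau to g, a dict keyed by pairs of nodes."""
    new = {}
    for s in elems:
        for t in elems:
            if not elems[s] and not elems[t]:
                new[s, t] = 0.0                      # both sets empty
            elif not elems[s] or not elems[t]:
                new[s, t] = 1.0                      # exactly one set empty
            else:
                a = max(min(g[u, v] for v in elems[t]) for u in elems[s])
                b = max(min(g[u, v] for u in elems[s]) for v in elems[t])
                new[s, t] = 0.5 * max(a, b)
    return new

# Nodes 0..K stand for the sets n-bar; "omega" stands for Omega = {Omega}.
K = 8
elems = {0: []}
for n in range(1, K + 1):
    elems[n] = [n - 1]
elems["omega"] = ["omega"]

# Iterate tau from the zero function; the contraction constant is 1/2.
g = {(s, t): 0.0 for s in elems for t in elems}
for _ in range(60):
    g = tau(g, elems)
```

After the iteration, `g[n, "omega"]` equals $2^{-n}$ for each $n \le K$, so the distances from the sets $\bar n$ to $\Omega$ shrink to 0 even though no $\bar n$ equals $\Omega$.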

Nevertheless, it is possible to relate this map to the coalgebraic structure of $V$.

Figure 1: Distinct sets of distance 0. Left: $\{\bar n \mid n \ge 0\}$. Right: $\{\bar n \mid n \ge 0\} \cup \{\Omega\}$.

The map $d$ defines a pseudometric topology with basic open neighborhoods $\{t \mid d(s, t) < \varepsilon\}$
for each set $s$ and $\varepsilon > 0$, but because $d$ is only a pseudometric, the topology does not
have nice separation properties. However, if we define $s \approx t \iff d(s, t) = 0$, then $d$ is
well-defined on $\approx$-equivalence classes and is a metric on the quotient space.

More interestingly, we can identify a natural class of canonical elements, one in each
$\approx$-class, such that $d$, restricted to canonical elements, is a metric; moreover, the canonical
elements are a maximal class for which this is true. Thus the quotient space is isometric
to the subspace of canonical elements. The canonical elements include all the hereditarily
finite sets.

The canonical elements are defined as the images of the function $F : V \to V$, defined
coinductively as follows:

\[ F(s) \stackrel{\mathrm{def}}{=} \{F(u) \mid u \in \mathrm{cl}(s)\}, \tag{6.4} \]

where cl denotes closure in the pseudometric topology. The equation (6.4) determines $F$
uniquely, as with (6.1). A set $s$ is called canonical if $s = F(t)$ for some $t$; equivalently, by
Corollary 6.3(ii) below, if $s$ is a fixpoint of $F$. For example, the right-hand side of Fig. 1 is
$F$ applied to the left-hand side, and the set on the right-hand side is canonical.



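Lemma 6.1. $d(s,t) = 0$ iff $\mathrm{cl}(s) = \mathrm{cl}(t)$.

Proof. If $s = t = \emptyset$, then both sides are true. If exactly one of $s, t$ is $\emptyset$, then both sides are
false. Finally, if both $s, t \ne \emptyset$, then

\[ \begin{aligned}
d(s,t) = 0 &\iff \sup_{u \in s} \inf_{v \in t} d(u,v) = 0 \ \wedge\ \sup_{v \in t} \inf_{u \in s} d(u,v) = 0 \\
&\iff \forall u \in s\ \forall \varepsilon > 0\ \exists v \in t\ \ d(u,v) < \varepsilon \ \wedge\ \forall v \in t\ \forall \varepsilon > 0\ \exists u \in s\ \ d(u,v) < \varepsilon \\
&\iff s \subseteq \mathrm{cl}(t) \ \wedge\ t \subseteq \mathrm{cl}(s) \\
&\iff \mathrm{cl}(s) = \mathrm{cl}(t).
\end{aligned} \]

□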

Theorem 6.2. 

(i) If $d(s,t) = 0$, then $F(s) = F(t)$.

(ii) For all $s$, $d(s, F(s)) = 0$; that is, $s \approx F(s)$.

Proof.

(i) By Lemma 6.1, if $d(s,t) = 0$, then $\mathrm{cl}(s) = \mathrm{cl}(t)$, and the conclusion $F(s) = F(t)$ is
immediate from (6.4).

(ii) We proceed by coinduction on the definition of $d$. We strengthen the coinduction
hypothesis $g(s, F(s)) = 0$ with the two extra assertions that $0 \le g(s,t) \le d(s,t)$ and
that $g$ satisfies the triangle inequality. We wish to show that this combined property
holds of $\tau(g)$ under the assumption that it holds of $g$.

That $0 \le \tau(g)(s,t) \le \tau(d)(s,t) = d(s,t)$ is clear from the coinduction hypothesis
and the monotonicity of the operators in the definition of $\tau$. The argument that $\tau(g)$
satisfies the triangle inequality is equally straightforward. Thus it remains to show
that $\tau(g)(s, F(s)) = 0$.

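By definition of $F$, $s = \emptyset$ iff $F(s) = \emptyset$, and in this case $\tau(g)(s, F(s)) = 0$ by
definition of $\tau$. Otherwise $s \neq \emptyset$ and $F(s) \neq \emptyset$. To show $\tau(g)(s, F(s)) = 0$ in this
case, we need to show that

\[ \sup_{u \in s} \inf_{v \in F(s)} g(u, v) = \sup_{u \in s} \inf_{w \in \mathrm{cl}(s)} g(u, F(w)) = 0, \]

\[ \sup_{v \in F(s)} \inf_{u \in s} g(u, v) = \sup_{w \in \mathrm{cl}(s)} \inf_{u \in s} g(u, F(w)) = 0. \]

It suffices to show

\[ \forall u \in s\ \inf_{w \in \mathrm{cl}(s)} g(u, F(w)) = 0, \qquad \forall w \in \mathrm{cl}(s)\ \inf_{u \in s} g(u, F(w)) = 0. \]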

For the former, we can take $w = u$; then the result follows from the coinduction
hypothesis $g(u, F(u)) = 0$. For the latter, let $w \in \mathrm{cl}(s)$. Here we use all three clauses
of the coinduction hypothesis:

\[ \inf_{u \in s} g(u, F(w)) \le \inf_{u \in s} g(u, w) + g(w, F(w)) \le \inf_{u \in s} d(u, w) + 0 = 0, \]

the last equation from the fact that $w \in \mathrm{cl}(s)$. □

Corollary 6.3.

(i) $d(s,t) = 0$ iff $F(s) = F(t)$.

(ii) For all $s$, $F(F(s)) = F(s)$.

(iii) Every $\approx$-equivalence class contains exactly one canonical set, and $d$ restricted to
canonical sets is a metric. Moreover, the canonical sets are a maximal class for which this
is true.



6.4. Compactness. For the results of Section 6.5 we need to show that the space of non-well-founded
sets is compact under $d$, thus complete. We will show that every infinite set
has a limit point. Define the equivalence relations $\approx_n$ inductively by:

\[ s \approx_0 t \ \text{ for all } s, t, \qquad s \approx_{n+1} t \ \stackrel{\mathrm{def}}{\iff}\ \forall u \in s\ \exists v \in t\ u \approx_n v \;\wedge\; \forall v \in t\ \exists u \in s\ u \approx_n v. \]

Also define inductively

\[ S_0 = \emptyset, \qquad S_{n+1} = 2^{S_n}, \]

where $2^A$ denotes the powerset of $A$. Each $S_n$ is a well-founded hereditarily finite set. For
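$n \ge 0$, define the map $f_n : V \to S_{n+1}$ inductively by

\[ f_0(s) \stackrel{\mathrm{def}}{=} \emptyset, \qquad f_{n+1}(s) \stackrel{\mathrm{def}}{=} \{f_n(u) \mid u \in s\}. \]

The following properties of $S_n$, $\approx_n$, and $f_n$ are easily established by induction on $n$.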

Lemma 6.4. For all $s, t \in V$ and $m, n \ge 0$,

(i) $f_n(s) \in S_{n+1}$;

(ii) if $s \in S_{n+1}$ then $f_n(s) = s$;

(iii) $s \approx_n f_n(s)$;

(iv) $f_n(f_m(s)) = f_{\min(m,n)}(s)$;

(v) if $s, t \in S_{n+1}$ and $s \approx_n t$, then $s = t$.
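
Clauses (i), (ii), and (iv) of Lemma 6.4 can be spot-checked on small well-founded sets. The sketch below is illustrative only and is no substitute for the induction. It represents hereditarily finite well-founded sets as nested `frozenset` values and assumes the levels are $S_0 = \emptyset$ and $S_{n+1} = 2^{S_n}$. The helper names `powerset`, `levels`, and `samples` are hypothetical.

```python
from itertools import chain, combinations


def powerset(a):
    """Return the set of all subsets of the finite set a."""
    items = list(a)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return frozenset(frozenset(c) for c in subsets)


def f(n, s):
    """f_0(s) is the empty set; f_{n+1}(s) = {f_n(u) | u in s}."""
    if n == 0:
        return frozenset()
    return frozenset(f(n - 1, u) for u in s)


# Assumed levels: S_0 is empty and S_{n+1} is the powerset of S_n.
levels = [frozenset()]
for _ in range(4):
    levels.append(powerset(levels[-1]))

samples = levels[4]  # the elements of S_4, used as test sets
for s in samples:
    for n in range(4):
        assert f(n, s) in levels[n + 1]              # Lemma 6.4(i)
        if s in levels[n + 1]:
            assert f(n, s) == s                      # Lemma 6.4(ii)
        for m in range(4):
            assert f(n, f(m, s)) == f(min(m, n), s)  # Lemma 6.4(iv)
print(len(samples), "sample sets checked")
```

The check stops at $S_4$ because the levels grow as an iterated exponential, so enumerating $S_5$ is already impractical for an exhaustive loop of this kind.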

Lemma 6.5. For all $s, t \in V$ and $n \ge 0$, the following are equivalent:

(i) $s \approx_n t$;

(ii) $f_n(s) = f_n(t)$;

(iii) $d(s,t) \le 2^{-n}$.
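
The equivalence of (i) and (ii) can be illustrated on sets that are not well-founded by working with a graph representation instead of nested sets. The following minimal sketch assumes the coding $0 = \emptyset$, $n+1 = \{n\}$ for numerals and $\Omega = \{\Omega\}$. The names `children`, `f`, and `approx` are illustrative. Under this coding the loop confirms that $f_k(n) = f_k(\Omega)$ exactly when $n \ge k$, which is what (iii) predicts if $d(n, \Omega) = 2^{-n}$.

```python
from functools import lru_cache

N = 6  # largest numeral considered (illustrative choice)

# Child map of a graph representation: node -> list of element nodes.
# Assumed coding: 0 = {} and n+1 = {n}; "omega" is the set with omega = {omega}.
children = {n: ([n - 1] if n > 0 else []) for n in range(N + 1)}
children["omega"] = ["omega"]


@lru_cache(maxsize=None)
def f(n, node):
    """f_0 is constantly empty; f_{n+1}(s) = {f_n(u) | u in s}."""
    if n == 0:
        return frozenset()
    return frozenset(f(n - 1, u) for u in children[node])


@lru_cache(maxsize=None)
def approx(n, s, t):
    """Decide s ~_n t by unfolding the inductive definition n times."""
    if n == 0:
        return True
    cs, ct = children[s], children[t]
    forth = all(any(approx(n - 1, u, v) for v in ct) for u in cs)
    back = all(any(approx(n - 1, u, v) for u in cs) for v in ct)
    return forth and back


for k in range(N + 1):
    for n in range(N + 1):
        same_image = f(k, n) == f(k, "omega")
        assert same_image == approx(k, n, "omega")  # (i) iff (ii)
        assert same_image == (n >= k)               # agrees with (iii)
```

Although $\Omega$ has no well-founded representation, each image $f_k(\Omega)$ is an ordinary finite nested set, so it can be built and compared directly.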

For each $s \in V$, let $f(s)$ denote the sequence $f_0(s), f_1(s), f_2(s), \ldots$. It follows from
Lemma 6.4(iv) that $f_n(f_{n+1}(s)) = f_n(s)$. Moreover, we have the following representation
theorem as converse:

Lemma 6.6. Any sequence $s_0, s_1, s_2, \ldots$ such that $f_n(s_{n+1}) = s_n$ for all $n \ge 0$ is $f(s)$ for
some $s$.

Proof. Let $W$ be the set of all sequences $s = s_0, s_1, s_2, \ldots$ such that $s_n = f_n(s_{n+1})$, $n \ge 0$.
This is a set, since the defining condition implies $s_n \in S_{n+1}$. Consider the system with
nodes $W$ and edges $N$ defined by

\[ u \mathrel{N} s \ \stackrel{\mathrm{def}}{\iff}\ \forall n \ge 0\ \ u_n \in s_{n+1}. \]

We claim that $f_n(s) = s_n$. The proof is by induction on $n$. Certainly $f_0(s) = \emptyset = s_0$, since
$s_0 = f_0(s_1) \in S_1$ and $\emptyset$ is the only element of $S_1$. Now suppose the claim is true for $n$.
Then

\[
\begin{aligned}
f_{n+1}(s) &= \{f_n(u) \mid u \in W,\ u \mathrel{N} s\} = \{f_n(u) \mid u \in W,\ \forall k\ u_k \in s_{k+1}\} \\
&= \{u_n \mid u \in W,\ \forall k\ u_k \in s_{k+1}\} = s_{n+1}.
\end{aligned}
\]

The last equation requires that for all $a \in s_{n+1}$, there exists $u \in W$ such that $u_n = a$. The
sequence $u = f_0(a), f_1(a), f_2(a), \ldots$ does it. □

Lemma 6.7. The space V is compact under the pseudometric d, therefore complete. 

Proof. We wish to show that every infinite set $s$ has a limit point $p$ (not necessarily contained
in $s$). Let $W$ be the tree of all sequences $u_0, u_1, u_2, \ldots$ such that $f_n(u_{n+1}) = u_n$ for all $n \ge 0$
as defined in the proof of Lemma 6.6. This is a finitely branching, infinite tree with root $\emptyset$.
By König's lemma, there is an infinite path $p$ in $W$ such that for every node $p_n$ on the path,
there are infinitely many $u \in s$ such that $f_n(u) = p_n$. The set represented by the path $p$
as given by Lemma 6.6 is the desired limit point, since for all $k$, there exist infinitely many
$u \in s$ such that $f_k(u) = p_k = f_k(p)$, therefore $d(u, p) \le 2^{-k}$. □



6.5. Hereditarily Finite Sets Are Canonical. Let $\varphi$ be a property of sets. We define a
set to be hereditarily $\varphi$ ($\mathrm{H}\varphi$) if it has an APG representation in which every node represents
a set satisfying $\varphi$. Equivalently, $\mathrm{H}\varphi$ is the largest solution of

\[ \mathrm{H}\varphi(s) \iff \varphi(s) \wedge \forall u \in s\ \mathrm{H}\varphi(u). \]

The hereditarily finite (HF) sets are those possessing an APG representation in which
every node has finite out-degree (not necessarily bounded). Note that this differs from
Aczel's definition [21 p. 7]. Aczel defines a set to be hereditarily finite if it has a finite APG,
which is a much stronger condition. Aczel's definition and ours coincide for well-founded
sets by König's lemma, but not for non-well-founded sets in general. For example, the set
$f(0)$, where $f$ is defined coinductively by $f(n) = \{n, f(n + 1)\}$ (Fig. 2), is hereditarily finite
in our sense but not Aczel's. We would prefer the term regular or rational for sets that
are hereditarily finite in Aczel's sense, since they are exactly the sets that have a regular or
rational tree representation [6].

Figure 2: $f(0)$

A set is hereditarily closed (HC) if it has an APG representation in which every node 
represents a closed set in the pseudometric topology. Recall that a set is singular if it forms 
a singleton $\approx$-class.

Lemma 6.8. If $s$ is singular, then all elements of $s$ are singular. Thus all singular sets are
hereditarily singular.

Proof. Suppose $u \in s$, $v \neq u$, and $d(u,v) = 0$. We claim that (i) if $v \notin s$, then $d(s, s \cup \{v\}) = 0$,
and (ii) if $v \in s$, then $d(s, s - \{v\}) = 0$; thus in either case, $s$ is not singular.
In case (i), we must show

\[ \sup_{x \in s} \inf_{y \in s \cup \{v\}} d(x, y) = 0, \qquad \sup_{y \in s \cup \{v\}} \inf_{x \in s} d(x, y) = 0. \]

It suffices to show

\[ \forall x \in s\ \exists y \in s \cup \{v\}\ d(x, y) = 0, \qquad \forall y \in s \cup \{v\}\ \exists x \in s\ d(x, y) = 0. \]

The former is immediate by picking $y = x$. For the latter, pick $x = y$ if $y \neq v$, otherwise
pick $x = u$.

Case (ii) is really the same case as (i), with $s - \{v\}$ in (ii) playing the role of $s$ in (i). □

Lemma 6.9.

(i) If $s$ is closed and all elements of $s$ are closed, then all elements of $s$ are singular.

(ii) Every singular set is closed.

Proof.

(i) Suppose $u \in s$ and $d(u, v) = 0$. Then $v \in s$, since $s$ is closed. By Lemma 6.1,
$\mathrm{cl}(u) = \mathrm{cl}(v)$. But $u$ and $v$ are both closed, so $u = v$.

(ii) By Lemma 6.1, $d(\mathrm{cl}(u), u) = 0$, so if $u$ is singular then $u = \mathrm{cl}(u)$. □

Theorem 6.10. A set is hereditarily closed if and only if it is closed and all its elements 
are singular. 

Proof. This follows directly from Lemmas 6.8 and 6.9. □

Theorem 6.11. A set is singular if and only if it is hereditarily finite. 

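Proof. Suppose first that $s$ is hereditarily finite (HF). Consider the binary relation on sets
$s$, $t$ defined by

\[ \mathrm{HF}(s) \wedge d(s,t) = 0. \qquad (6.5) \]

We have

\[
\begin{aligned}
\mathrm{HF}(s) \wedge d(s,t) = 0 \ &\Rightarrow\ \forall v \in t\ \forall \varepsilon > 0\ \exists u \in s\ \ \mathrm{HF}(u) \wedge d(u,v) < \varepsilon \\
&\Rightarrow\ \forall v \in t\ \exists u \in s\ \ \mathrm{HF}(u) \wedge d(u, v) = 0, \qquad (6.6)
\end{aligned}
\]

since $s$ is finite. It follows that

\[
\begin{aligned}
\mathrm{HF}(s) \wedge \mathrm{HF}(t) \wedge d(s,t) = 0 \ \Rightarrow\ & \forall u \in s\ \exists v \in t\ \ \mathrm{HF}(u) \wedge \mathrm{HF}(v) \wedge d(u, v) = 0 \\
\wedge\ & \forall v \in t\ \exists u \in s\ \ \mathrm{HF}(u) \wedge \mathrm{HF}(v) \wedge d(u, v) = 0,
\end{aligned}
\]

so the binary relation $\mathrm{HF}(s) \wedge \mathrm{HF}(t) \wedge d(s,t) = 0$ is a bisimulation; thus

\[ \mathrm{HF}(s) \wedge \mathrm{HF}(t) \wedge d(s, t) = 0 \ \Rightarrow\ s = t. \]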

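Thus if $\mathrm{HF}(s)$, then there is a positive lower bound $\delta > 0$ on $d(u, v)$ for $u, v \in s$, $u \neq v$. But
then

\[
\begin{aligned}
\mathrm{HF}(s) \wedge d(s,t) = 0 \ &\Rightarrow\ \forall u \in s\ \forall \varepsilon > 0\ \exists v \in t\ \ \mathrm{HF}(u) \wedge d(u,v) < \varepsilon \\
&\Rightarrow\ \forall u \in s\ \exists v \in t\ \ \mathrm{HF}(u) \wedge d(u, v) < \delta,
\end{aligned}
\]

and using (6.6), this gives

\[
\begin{aligned}
\mathrm{HF}(s) \wedge d(s,t) = 0 \ &\Rightarrow\ \forall u \in s\ \exists v \in t\ \ \mathrm{HF}(u) \wedge d(u, v) < \delta \wedge \exists w \in s\ d(w, v) = 0 \\
&\Rightarrow\ \forall u \in s\ \exists v \in t\ \exists w \in s\ \ \mathrm{HF}(u) \wedge d(w, v) = 0 \wedge d(u, w) < \delta \\
&\Rightarrow\ \forall u \in s\ \exists v \in t\ \exists w \in s\ \ \mathrm{HF}(u) \wedge d(w, v) = 0 \wedge u = w \\
&\Rightarrow\ \forall u \in s\ \exists v \in t\ \ \mathrm{HF}(u) \wedge d(u, v) = 0.
\end{aligned}
\]

This combined with (6.6) says that the relation (6.5) itself is a bisimulation. Thus $\mathrm{HF}(s) \wedge d(s,t) = 0$
implies $s = t$; in other words, $\mathrm{HF}(s)$ implies that $s$ is singular.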

Now suppose that $s$ is singular. By Lemma 6.8, $s$ is hereditarily singular. We argue that
$s$ must be finite. If $s$ is infinite, then by Lemma 6.7, $s$ has a limit point $p$ (not necessarily
contained in $s$). We claim that (i) if $p \notin s$, then $d(s, s \cup \{p\}) = 0$, and (ii) if $p \in s$, then
$d(s, s - \{p\}) = 0$; thus in either case $s$ is not singular. For (i),

\[
\begin{aligned}
d(s, s \cup \{p\}) = 0 \iff\ & \forall u \in s\ \forall \varepsilon > 0\ \exists v \in s \cup \{p\}\ d(u, v) < \varepsilon \\
\wedge\ & \forall v \in s \cup \{p\}\ \forall \varepsilon > 0\ \exists u \in s\ d(u, v) < \varepsilon.
\end{aligned}
\]

The first clause is true by taking $v = u$. For the second clause, we can take $u = v$ unless
$v = p$. But if $v = p$, the condition reduces to

\[ \forall \varepsilon > 0\ \exists u \in s\ d(u, p) < \varepsilon, \]

which is true by Lemma 6.5.

Case (ii) is really the same as case (i), with $s - \{p\}$ in (ii) playing the role of $s$ in (i). □

Theorem 6.12. Every hereditarily finite set is hereditarily closed, and every hereditarily
closed set is canonical. Both implications are strict.

Proof. The first implication $\mathrm{HF}(s) \Rightarrow \mathrm{HC}(s)$ follows directly from Lemma 6.9(ii) and
Theorem 6.11.

For the implication $\mathrm{HC}(s) \Rightarrow s = F(s)$, one approach would be to show that the binary
relation on sets $s$, $t$ defined by $\mathrm{HC}(s) \wedge t = F(s)$ is a bisimulation. Alternatively, we can
observe that on hereditarily closed sets $s$, the coinductive definition

\[ F(s) = \{F(u) \mid u \in \mathrm{cl}(s)\} \]

is equivalent to the coinductive definition

\[ F(s) = \{F(u) \mid u \in s\}, \]

which uniquely defines the identity function, thus $s = F(s)$ on all such sets.

Both implications are strict. An hereditarily closed set that is not finite is $\{n \mid n > 0\} \cup \Omega$,
and a canonical set that is not closed is $\{\{n \mid n > 0\} \cup \Omega\}$. □

7. Conclusions and Future Work 

We have illustrated the use of the metric coinduction principle in four areas: infinite 
streams, Markov chains, Markov decision processes, and non-well-founded sets. In all these 
areas, metric coinduction can be used to simplify proofs or derive new insights. 

Other areas are likely to be amenable to such techniques. In particular, iterated function 
systems seem to be a promising candidate.